Dataset statistics
| Number of variables | 26 |
|---|---|
| Number of observations | 226763 |
| Missing cells | 1652364 |
| Missing cells (%) | 28.0% |
| Duplicate rows | 655 |
| Duplicate rows (%) | 0.3% |
| Total size in memory | 45.0 MiB |
| Average record size in memory | 208.0 B |
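The derived figures in this table can be cross-checked from the raw counts. The sketch below is only a consistency check on the numbers printed above; it assumes the missing-cell percentage is taken over all cells (variables times observations) and that 1 MiB is 1024² bytes.

```python
# Figures copied from the "Dataset statistics" table.
n_vars = 26
n_rows = 226_763
missing_cells = 1_652_364
duplicate_rows = 655
total_mib = 45.0

# Assumption: percentages are computed over all cells / all rows.
total_cells = n_vars * n_rows
missing_pct = round(100 * missing_cells / total_cells, 1)
duplicate_pct = round(100 * duplicate_rows / n_rows, 1)

# Assumption: 1 MiB = 1024**2 bytes.
avg_record_bytes = total_mib * 1024**2 / n_rows

print(missing_pct, duplicate_pct, round(avg_record_bytes))
```

Under these assumptions the three values agree with the reported 28.0%, 0.3% and 208.0 B.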
Variable types
| DateTime | 2 |
|---|---|
| Numeric | 2 |
| Categorical | 19 |
| Text | 3 |
Alerts
| Alert | Type |
|---|---|
| Dataset has 655 (0.3%) duplicate rows | Duplicates |
| GARGANTA is highly overall correlated with DOR_ABD and 2 other fields | High correlation |
| DISPNEIA is highly overall correlated with DESC_RESP | High correlation |
| DESC_RESP is highly overall correlated with DISPNEIA and 1 other field | High correlation |
| SATURACAO is highly overall correlated with DESC_RESP | High correlation |
| DIARREIA is highly overall correlated with VOMITO | High correlation |
| VOMITO is highly overall correlated with DIARREIA | High correlation |
| DOR_ABD is highly overall correlated with GARGANTA and 3 other fields | High correlation |
| FADIGA is highly overall correlated with DOR_ABD | High correlation |
| PERD_OLFT is highly overall correlated with GARGANTA and 2 other fields | High correlation |
| PERD_PALA is highly overall correlated with GARGANTA and 2 other fields | High correlation |
| ESTRANG is highly imbalanced (93.6%) | Imbalance |
| CS_ZONA is highly imbalanced (73.3%) | Imbalance |
| TOSSE is highly imbalanced (56.6%) | Imbalance |
| DIARREIA is highly imbalanced (63.1%) | Imbalance |
| VOMITO is highly imbalanced (54.4%) | Imbalance |
| DOR_ABD is highly imbalanced (63.8%) | Imbalance |
| PERD_OLFT is highly imbalanced (76.2%) | Imbalance |
| PERD_PALA is highly imbalanced (76.5%) | Imbalance |
| ESTRANG has 18945 (8.4%) missing values | Missing |
| CS_ZONA has 19427 (8.6%) missing values | Missing |
| OUT_ANIM has 226032 (99.7%) missing values | Missing |
| FEBRE has 32100 (14.2%) missing values | Missing |
| TOSSE has 20150 (8.9%) missing values | Missing |
| GARGANTA has 63540 (28.0%) missing values | Missing |
| DISPNEIA has 29836 (13.2%) missing values | Missing |
| DESC_RESP has 36846 (16.2%) missing values | Missing |
| SATURACAO has 41637 (18.4%) missing values | Missing |
| DIARREIA has 64895 (28.6%) missing values | Missing |
| VOMITO has 63050 (27.8%) missing values | Missing |
| DOR_ABD has 66861 (29.5%) missing values | Missing |
| FADIGA has 62595 (27.6%) missing values | Missing |
| PERD_OLFT has 68822 (30.3%) missing values | Missing |
| PERD_PALA has 69072 (30.5%) missing values | Missing |
| OUTRO_SIN has 67384 (29.7%) missing values | Missing |
| OUTRO_DES has 163149 (71.9%) missing values | Missing |
| VACINA has 106570 (47.0%) missing values | Missing |
| DOSE_2REF has 189217 (83.4%) missing values | Missing |
| CLASSI_FIN has 17353 (7.7%) missing values | Missing |
| CLASSI_OUT has 224773 (99.1%) missing values | Missing |
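The imbalance percentages in the alerts are not explained in the report itself. They appear consistent with one minus the normalized Shannon entropy of the non-missing class counts; the sketch below is an illustration under that assumption, not a statement of how the tool is implemented. The function name `imbalance_score` is hypothetical, and the counts are taken from the Common Values tables for ESTRANG and CS_ZONA further down.

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class
    dominates. Missing values are assumed to be excluded from `counts`.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2 or total == 0:
        # Convention chosen for this sketch only.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)

estrang = imbalance_score([206255, 1563])
cs_zona = imbalance_score([189066, 12793, 3079, 2398])
print(f"{estrang:.1%} {cs_zona:.1%}")
```

With these inputs the sketch gives 93.6% for ESTRANG and 73.3% for CS_ZONA, matching the two alerts.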
Reproduction
| Analysis started | 2023-11-01 17:17:05.055421 |
|---|---|
| Analysis finished | 2023-11-01 17:17:56.551385 |
| Duration | 51.5 seconds |
| Software version | ydata-profiling v4.6.1 |
| Download configuration | config.json |
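A report of this kind can be regenerated along the following lines. This is a minimal sketch: the helper name `build_report`, the input path and the report title are hypothetical, and the original run may have used settings from config.json that are not reproduced here. `ProfileReport` and `to_file` are the standard ydata-profiling entry points; the CSV separator and encoding may need adjusting for the actual source file.

```python
def build_report(csv_path, out_path="report.html"):
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so this module loads even where the packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the data is a plain CSV readable with pandas defaults.
    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Dataset profile")
    profile.to_file(out_path)
    return out_path
```

Pinning the package to the version listed above (4.6.1) is the safest way to get comparable output, since alert thresholds and layouts can change between releases.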
DT_SIN_PRI
Date
| Distinct | 288 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.7 MiB |
| Minimum | 2023-01-01 00:00:00 |
| Maximum | 2023-12-10 00:00:00 |
SEM_PRI
Real number (ℝ)
| Distinct | 42 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 19.334662 |
| Minimum | 1 |
| Maximum | 42 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.7 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 11 |
| median | 19 |
| Q3 | 27 |
| 95-th percentile | 37 |
| Maximum | 42 |
| Range | 41 |
| Interquartile range (IQR) | 16 |
Descriptive statistics
| Standard deviation | 10.44513 |
|---|---|
| Coefficient of variation (CV) | 0.54022823 |
| Kurtosis | -0.88824396 |
| Mean | 19.334662 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 0.17118062 |
| Sum | 4384386 |
| Variance | 109.10075 |
| Monotonicity | Not monotonic |
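Several of the descriptive statistics for SEM_PRI are functions of the others, so they can be checked against each other. The sketch below recomputes the coefficient of variation, variance, sum, IQR and range from the values printed above; it is a consistency check on the reported figures, not a recomputation from the data.

```python
import math

# Values copied from the SEM_PRI tables.
n = 226_763
mean, std = 19.334662, 10.44513
q1, q3 = 11, 27
minimum, maximum = 1, 42

cv = std / mean            # coefficient of variation
variance = std ** 2        # variance is the squared standard deviation
total = mean * n           # sum = mean * number of observations
iqr = q3 - q1
value_range = maximum - minimum

print(round(cv, 4), round(variance, 2), round(total), iqr, value_range)
```

Up to rounding of the inputs, these agree with the reported CV (0.54022823), variance (109.10075), sum (4384386), IQR (16) and range (41).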
Common Values
| Value | Count | Frequency (%) |
| 20 | 8436 | 3.7% |
| 18 | 7926 | 3.5% |
| 21 | 7705 | 3.4% |
| 19 | 7644 | 3.4% |
| 12 | 7622 | 3.4% |
| 15 | 7582 | 3.3% |
| 16 | 7335 | 3.2% |
| 13 | 7250 | 3.2% |
| 22 | 7117 | 3.1% |
| 11 | 7109 | 3.1% |
| Other values (32) | 151037 | 66.6% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 1 | 5767 | 2.5% |
| 2 | 4550 | 2.0% |
| 3 | 4039 | 1.8% |
| 4 | 3648 | 1.6% |
| 5 | 4216 | 1.9% |
| 6 | 5172 | 2.3% |
| 7 | 5835 | 2.6% |
| 8 | 5975 | 2.6% |
| 9 | 6419 | 2.8% |
| 10 | 6726 | 3.0% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 42 | 6 | < 0.1% |
| 41 | 789 | 0.3% |
| 40 | 2961 | 1.3% |
| 39 | 3440 | 1.5% |
| 38 | 4030 | 1.8% |
| 37 | 4071 | 1.8% |
| 36 | 3901 | 1.7% |
| 35 | 4102 | 1.8% |
| 34 | 4426 | 2.0% |
| 33 | 4443 | 2.0% |
ESTRANG
Categorical
IMBALANCE  MISSING 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 18945 |
| Missing (%) | 8.4% |
| Memory size | 1.7 MiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 623454 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2.0 |
|---|---|
| 2nd row | 2.0 |
| 3rd row | 2.0 |
| 4th row | 2.0 |
| 5th row | 2.0 |
Common Values
| Value | Count | Frequency (%) |
| 2.0 | 206255 | 91.0% |
| 1.0 | 1563 | 0.7% |
| (Missing) | 18945 | 8.4% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 2.0 | 206255 | 99.2% |
| 1.0 | 1563 | 0.8% |
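The two tables above give different percentages for the same count (value 1.0: 0.7% versus 0.8%). The difference is consistent with the first table dividing by all rows and the plot table dividing by non-missing rows only. The sketch below illustrates that reading; the helper name `pct` is hypothetical.

```python
def pct(count, denominator):
    """Percentage rounded to one decimal place, as shown in the report."""
    return round(100 * count / denominator, 1)

n_rows = 226_763                 # all observations
n_missing = 18_945               # ESTRANG missing values
n_present = n_rows - n_missing   # non-missing observations

share_of_all_rows = pct(1_563, n_rows)      # Common Values table
share_of_present = pct(1_563, n_present)    # Common Values (Plot) table
print(share_of_all_rows, share_of_present)
```

The same pattern holds for the other categorical variables in this report, so percentages from the two tables should not be mixed.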
Most occurring characters
| Value | Count | Frequency (%) |
| . | 207818 | |
| 0 | 207818 | |
| 2 | 206255 | |
| 1 | 1563 | 0.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 415636 | |
| Other Punctuation | 207818 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 207818 | |
| 2 | 206255 | |
| 1 | 1563 | 0.4% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 207818 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 623454 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| . | 207818 | |
| 0 | 207818 | |
| 2 | 206255 | |
| 1 | 1563 | 0.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 623454 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| . | 207818 | |
| 0 | 207818 | |
| 2 | 206255 | |
| 1 | 1563 | 0.3% |
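The character statistics above show that every ESTRANG value is three characters long and made of a digit, a dot and a zero ("1.0", "2.0"), which suggests integer codes that were stored as floats, most likely because the column contains missing values. A minimal sketch for converting such values back to integer codes follows; the function name `normalize_code` is hypothetical, and treating the string "nan" as missing is an assumption about how the data was exported.

```python
def normalize_code(raw):
    """Convert a float-like code such as '2.0' to int 2; None if missing."""
    if raw is None:
        return None
    text = str(raw).strip()
    # Assumption: empty strings and 'nan' both denote a missing value.
    if text == "" or text.lower() == "nan":
        return None
    value = float(text)
    if not value.is_integer():
        raise ValueError(f"unexpected non-integer code: {raw!r}")
    return int(value)

codes = [normalize_code(v) for v in ["2.0", "1.0", None, "nan", 2.0]]
print(codes)
```

Normalizing the codes first also makes the character-level tables unnecessary, since they only describe the float formatting.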
NU_IDADE_N
Real number (ℝ)
| Distinct | 121 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 31.029171 |
| Minimum | -5 |
| Maximum | 123 |
| Zeros | 1402 |
| Zeros (%) | 0.6% |
| Negative | 3 |
| Negative (%) | < 0.1% |
| Memory size | 1.7 MiB |
Quantile statistics
| Minimum | -5 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 9 |
| Q3 | 65 |
| 95-th percentile | 87 |
| Maximum | 123 |
| Range | 128 |
| Interquartile range (IQR) | 62 |
Descriptive statistics
| Standard deviation | 32.817006 |
|---|---|
| Coefficient of variation (CV) | 1.0576179 |
| Kurtosis | -1.3002122 |
| Mean | 31.029171 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 0.6170725 |
| Sum | 7036268 |
| Variance | 1076.9559 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 1 | 26456 | 11.7% |
| 2 | 18114 | 8.0% |
| 3 | 14738 | 6.5% |
| 4 | 12882 | 5.7% |
| 5 | 10834 | 4.8% |
| 6 | 9071 | 4.0% |
| 7 | 7754 | 3.4% |
| 8 | 6661 | 2.9% |
| 9 | 5695 | 2.5% |
| 10 | 5037 | 2.2% |
| Other values (111) | 109521 | 48.3% |
Minimum 10 values
| Value | Count | Frequency (%) |
| -5 | 1 | < 0.1% |
| -3 | 1 | < 0.1% |
| -2 | 1 | < 0.1% |
| 0 | 1402 | 0.6% |
| 1 | 26456 | 11.7% |
| 2 | 18114 | 8.0% |
| 3 | 14738 | 6.5% |
| 4 | 12882 | 5.7% |
| 5 | 10834 | 4.8% |
| 6 | 9071 | 4.0% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 123 | 2 | < 0.1% |
| 117 | 1 | < 0.1% |
| 115 | 1 | < 0.1% |
| 114 | 1 | < 0.1% |
| 113 | 3 | < 0.1% |
| 112 | 1 | < 0.1% |
| 111 | 3 | < 0.1% |
| 110 | 4 | < 0.1% |
| 109 | 3 | < 0.1% |
| 108 | 2 | < 0.1% |
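The extreme values of NU_IDADE_N include three negative entries (-5, -3, -2) and a maximum of 123, which are worth reviewing before analysis. The sketch below separates values inside a plausibility window from those outside it. The function name `split_plausible` and the bounds 0 and 120 are illustrative assumptions, and the report does not state the unit of this field, so the upper bound in particular should be chosen with the data dictionary at hand.

```python
def split_plausible(ages, lower=0, upper=120):
    """Split values into (plausible, flagged) using inclusive bounds.

    The default bounds are illustrative only.
    """
    plausible, flagged = [], []
    for age in ages:
        if lower <= age <= upper:
            plausible.append(age)
        else:
            flagged.append(age)
    return plausible, flagged

# Small made-up sample that includes the extremes listed in the report.
sample = [-5, -3, -2, 0, 1, 65, 117, 123]
plausible, flagged = split_plausible(sample)
print(plausible, flagged)
```

Flagging rather than dropping keeps the decision about these records explicit.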
SG_UF
Categorical
| Distinct | 27 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 54 |
| Missing (%) | < 0.1% |
| Memory size | 1.7 MiB |
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
Characters and Unicode
| Total characters | 453418 |
|---|---|
| Distinct characters | 17 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | MG |
|---|---|
| 2nd row | RJ |
| 3rd row | SP |
| 4th row | SP |
| 5th row | SP |
Common Values
| Value | Count | Frequency (%) |
| SP | 64239 | 28.3% |
| PR | 22660 | 10.0% |
| MG | 19724 | 8.7% |
| RJ | 15533 | 6.8% |
| RS | 12034 | 5.3% |
| CE | 11382 | 5.0% |
| SC | 9669 | 4.3% |
| BA | 9130 | 4.0% |
| DF | 8045 | 3.5% |
| PE | 7340 | 3.2% |
| Other values (17) | 46953 | 20.7% |
Length
| Value | Count | Frequency (%) |
| sp | 64239 | 28.3% |
| pr | 22660 | 10.0% |
| mg | 19724 | 8.7% |
| rj | 15533 | 6.9% |
| rs | 12034 | 5.3% |
| ce | 11382 | 5.0% |
| sc | 9669 | 4.3% |
| ba | 9130 | 4.0% |
| df | 8045 | 3.5% |
| pe | 7340 | 3.2% |
| Other values (17) | 46953 | 20.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| P | 105077 | |
| S | 98389 | |
| R | 55268 | |
| M | 33592 | 7.4% |
| G | 27046 | 6.0% |
| E | 24773 | 5.5% |
| A | 24046 | 5.3% |
| C | 23179 | 5.1% |
| J | 15533 | 3.4% |
| B | 12873 | 2.8% |
| Other values (7) | 33642 | 7.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 453418 |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| P | 105077 | |
| S | 98389 | |
| R | 55268 | |
| M | 33592 | 7.4% |
| G | 27046 | 6.0% |
| E | 24773 | 5.5% |
| A | 24046 | 5.3% |
| C | 23179 | 5.1% |
| J | 15533 | 3.4% |
| B | 12873 | 2.8% |
| Other values (7) | 33642 | 7.4% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 453418 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| P | 105077 | |
| S | 98389 | |
| R | 55268 | |
| M | 33592 | 7.4% |
| G | 27046 | 6.0% |
| E | 24773 | 5.5% |
| A | 24046 | 5.3% |
| C | 23179 | 5.1% |
| J | 15533 | 3.4% |
| B | 12873 | 2.8% |
| Other values (7) | 33642 | 7.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 453418 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| P | 105077 | |
| S | 98389 | |
| R | 55268 | |
| M | 33592 | 7.4% |
| G | 27046 | 6.0% |
| E | 24773 | 5.5% |
| A | 24046 | 5.3% |
| C | 23179 | 5.1% |
| J | 15533 | 3.4% |
| B | 12873 | 2.8% |
| Other values (7) | 33642 | 7.4% |
CS_ZONA
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 19427 |
| Missing (%) | 8.6% |
| Memory size | 1.7 MiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 622008 |
|---|---|
| Distinct characters | 6 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1.0 |
|---|---|
| 2nd row | 1.0 |
| 3rd row | 1.0 |
| 4th row | 1.0 |
| 5th row | 1.0 |
Common Values
| Value | Count | Frequency (%) |
| 1.0 | 189066 | 83.4% |
| 2.0 | 12793 | 5.6% |
| 9.0 | 3079 | 1.4% |
| 3.0 | 2398 | 1.1% |
| (Missing) | 19427 | 8.6% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1.0 | 189066 | 91.2% |
| 2.0 | 12793 | 6.2% |
| 9.0 | 3079 | 1.5% |
| 3.0 | 2398 | 1.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| . | 207336 | |
| 0 | 207336 | |
| 1 | 189066 | |
| 2 | 12793 | 2.1% |
| 9 | 3079 | 0.5% |
| 3 | 2398 | 0.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 414672 | |
| Other Punctuation | 207336 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 207336 | |
| 1 | 189066 | |
| 2 | 12793 | 3.1% |
| 9 | 3079 | 0.7% |
| 3 | 2398 | 0.6% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 207336 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 622008 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| . | 207336 | |
| 0 | 207336 | |
| 1 | 189066 | |
| 2 | 12793 | 2.1% |
| 9 | 3079 | 0.5% |
| 3 | 2398 | 0.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 622008 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| . | 207336 | |
| 0 | 207336 | |
| 1 | 189066 | |
| 2 | 12793 | 2.1% |
| 9 | 3079 | 0.5% |
| 3 | 2398 | 0.4% |
OUT_ANIM
Text
MISSING 
| Distinct | 115 |
|---|---|
| Distinct (%) | 15.7% |
| Missing | 226032 |
| Missing (%) | 99.7% |
| Memory size | 1.7 MiB |
Length
| Max length | 33 |
|---|---|
| Median length | 31 |
| Mean length | 8.8604651 |
| Min length | 1 |
Characters and Unicode
| Total characters | 6477 |
|---|---|
| Distinct characters | 32 |
| Distinct categories | 6 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 82 |
|---|---|
| Unique (%) | 11.2% |
Sample
| 1st row | GATO E CACHORRO |
|---|---|
| 2nd row | CACHORRO |
| 3rd row | CACHORRO |
| 4th row | GATO |
| 5th row | CACHORRO |
| Value | Count | Frequency (%) |
| cachorro | 445 | |
| gato | 182 | |
| e | 76 | 7.7% |
| cao | 38 | 3.8% |
| gatos | 29 | 2.9% |
| gato,cachorro | 14 | 1.4% |
| de | 12 | 1.2% |
| cachorros | 10 | 1.0% |
| estimacao | 10 | 1.0% |
| animais | 9 | 0.9% |
| Other values (89) | 163 | 16.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| O | 1379 | |
| C | 1057 | |
| R | 1002 | |
| A | 949 | |
| H | 507 | 7.8% |
| T | 278 | 4.3% |
| G | 276 | 4.3% |
| (space) | 258 | 4.0% |
| E | 165 | 2.5% |
| S | 116 | 1.8% |
| Other values (22) | 490 | 7.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 6117 | |
| Space Separator | 258 | 4.0% |
| Other Punctuation | 97 | 1.5% |
| Dash Punctuation | 2 | < 0.1% |
| Decimal Number | 2 | < 0.1% |
| Math Symbol | 1 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| O | 1379 | |
| C | 1057 | |
| R | 1002 | |
| A | 949 | |
| H | 507 | 8.3% |
| T | 278 | 4.5% |
| G | 276 | 4.5% |
| E | 165 | 2.7% |
| S | 116 | 1.9% |
| I | 103 | 1.7% |
| Other values (13) | 285 | 4.7% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 67 | |
| / | 14 | 14.4% |
| . | 13 | 13.4% |
| ; | 3 | 3.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 3 | 1 | |
| 1 | 1 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 258 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 2 |
Math Symbol
| Value | Count | Frequency (%) |
| + | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 6117 | |
| Common | 360 | 5.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| O | 1379 | |
| C | 1057 | |
| R | 1002 | |
| A | 949 | |
| H | 507 | 8.3% |
| T | 278 | 4.5% |
| G | 276 | 4.5% |
| E | 165 | 2.7% |
| S | 116 | 1.9% |
| I | 103 | 1.7% |
| Other values (13) | 285 | 4.7% |
Common
| Value | Count | Frequency (%) |
| (space) | 258 | 71.7% |
| , | 67 | 18.6% |
| / | 14 | 3.9% |
| . | 13 | 3.6% |
| ; | 3 | 0.8% |
| - | 2 | 0.6% |
| 3 | 1 | 0.3% |
| 1 | 1 | 0.3% |
| + | 1 | 0.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6477 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| O | 1379 | |
| C | 1057 | |
| R | 1002 | |
| A | 949 | |
| H | 507 | 7.8% |
| T | 278 | 4.3% |
| G | 276 | 4.3% |
| (space) | 258 | 4.0% |
| E | 165 | 2.5% |
| S | 116 | 1.8% |
| Other values (22) | 490 | 7.6% |
FEBRE
Categorical
MISSING 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 32100 |
| Missing (%) | 14.2% |
| Memory size | 1.7 MiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 583989 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1.0 |
|---|---|
| 2nd row | 1.0 |
| 3rd row | 2.0 |
| 4th row | 1.0 |
| 5th row | 2.0 |
Common Values
| Value | Count | Frequency (%) |
| 1.0 | 128752 | 56.8% |
| 2.0 | 64489 | 28.4% |
| 9.0 | 1422 | 0.6% |
| (Missing) | 32100 | 14.2% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1.0 | 128752 | 66.1% |
| 2.0 | 64489 | 33.1% |
| 9.0 | 1422 | 0.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| . | 194663 | |
| 0 | 194663 | |
| 1 | 128752 | |
| 2 | 64489 | 11.0% |
| 9 | 1422 | 0.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 389326 | |
| Other Punctuation | 194663 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 194663 | |
| 1 | 128752 | |
| 2 | 64489 | 16.6% |
| 9 | 1422 | 0.4% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 194663 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 583989 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| . | 194663 | |
| 0 | 194663 | |
| 1 | 128752 | |
| 2 | 64489 | 11.0% |
| 9 | 1422 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 583989 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| . | 194663 | |
| 0 | 194663 | |
| 1 | 128752 | |
| 2 | 64489 | 11.0% |
| 9 | 1422 | 0.2% |
TOSSE
Categorical
IMBALANCE  MISSING 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 20150 |
| Missing (%) | 8.9% |
| Memory size | 1.7 MiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 619839 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1.0 |
|---|---|
| 2nd row | 9.0 |
| 3rd row | 1.0 |
| 4th row | 2.0 |
| 5th row | 1.0 |
Common Values
| Value | Count | Frequency (%) |
| 1.0 | 171597 | 75.7% |
| 2.0 | 33998 | 15.0% |
| 9.0 | 1018 | 0.4% |
| (Missing) | 20150 | 8.9% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1.0 | 171597 | 83.1% |
| 2.0 | 33998 | 16.5% |
| 9.0 | 1018 | 0.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| . | 206613 | |
| 0 | 206613 | |
| 1 | 171597 | |
| 2 | 33998 | 5.5% |
| 9 | 1018 | 0.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 413226 | |
| Other Punctuation | 206613 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 206613 | |
| 1 | 171597 | |
| 2 | 33998 | 8.2% |
| 9 | 1018 | 0.2% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 206613 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 619839 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| . | 206613 | |
| 0 | 206613 | |
| 1 | 171597 | |
| 2 | 33998 | 5.5% |
| 9 | 1018 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 619839 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| . | 206613 | |
| 0 | 206613 | |
| 1 | 171597 | |
| 2 | 33998 | 5.5% |
| 9 | 1018 | 0.2% |
GARGANTA
Categorical
HIGH CORRELATION  MISSING 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 63540 |
| Missing (%) | 28.0% |
| Memory size | 1.7 MiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 489669 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2.0 |
|---|---|
| 2nd row | 9.0 |
| 3rd row | 2.0 |
| 4th row | 2.0 |
| 5th row | 2.0 |
Common Values
| Value | Count | Frequency (%) |
| 2.0 | 133476 | 58.9% |
| 1.0 | 24738 | 10.9% |
| 9.0 | 5009 | 2.2% |
| (Missing) | 63540 | 28.0% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 2.0 | 133476 | 81.8% |
| 1.0 | 24738 | 15.2% |
| 9.0 | 5009 | 3.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| . | 163223 | |
| 0 | 163223 | |
| 2 | 133476 | |
| 1 | 24738 | 5.1% |
| 9 | 5009 | 1.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 326446 | |
| Other Punctuation | 163223 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 163223 | |
| 2 | 133476 | |
| 1 | 24738 | 7.6% |
| 9 | 5009 | 1.5% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 163223 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 489669 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| . | 163223 | |
| 0 | 163223 | |
| 2 | 133476 | |
| 1 | 24738 | 5.1% |
| 9 | 5009 | 1.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 489669 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| . | 163223 | |
| 0 | 163223 | |
| 2 | 133476 | |
| 1 | 24738 | 5.1% |
| 9 | 5009 | 1.0% |
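The HIGH CORRELATION flag on this variable refers to an association between categorical fields. The report does not say which measure was used; Cramér's V is a common choice for pairs of categorical variables and is assumed here purely for illustration. The sketch below computes it from a contingency table without bias correction. The function name `cramers_v` is hypothetical and the two example tables are made up, not taken from this dataset.

```python
import math

def cramers_v(table):
    """Cramér's V for a contingency table given as a list of rows of counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if n > 0 and k > 0 else 0.0

# Made-up tables: proportional rows (no association) and a diagonal table.
independent = cramers_v([[10, 20], [30, 60]])
perfect = cramers_v([[50, 0], [0, 50]])
print(independent, perfect)
```

Because many symptom fields share the same missingness pattern and the code 9.0, a high association between them may partly reflect how forms were filled in rather than a clinical relationship.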
DISPNEIA
Categorical
HIGH CORRELATION  MISSING 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 29836 |
| Missing (%) | 13.2% |
| Memory size | 1.7 MiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 590781 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1.0 |
|---|---|
| 2nd row | 9.0 |
| 3rd row | 1.0 |
| 4th row | 1.0 |
| 5th row | 1.0 |
Common Values
| Value | Count | Frequency (%) |
| 1.0 | 146911 | 64.8% |
| 2.0 | 48674 | 21.5% |
| 9.0 | 1342 | 0.6% |
| (Missing) | 29836 | 13.2% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1.0 | 146911 | 74.6% |
| 2.0 | 48674 | 24.7% |
| 9.0 | 1342 | 0.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| . | 196927 | |
| 0 | 196927 | |
| 1 | 146911 | |
| 2 | 48674 | 8.2% |
| 9 | 1342 | 0.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 393854 | |
| Other Punctuation | 196927 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 196927 | |
| 1 | 146911 | |
| 2 | 48674 | 12.4% |
| 9 | 1342 | 0.3% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 196927 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 590781 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| . | 196927 | |
| 0 | 196927 | |
| 1 | 146911 | |
| 2 | 48674 | 8.2% |
| 9 | 1342 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 590781 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| . | 196927 | |
| 0 | 196927 | |
| 1 | 146911 | |
| 2 | 48674 | 8.2% |
| 9 | 1342 | 0.2% |
DESC_RESP
Categorical
HIGH CORRELATION  MISSING 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 36846 |
| Missing (%) | 16.2% |
| Memory size | 1.7 MiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 569751 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1.0 |
|---|---|
| 2nd row | 9.0 |
| 3rd row | 2.0 |
| 4th row | 1.0 |
| 5th row | 1.0 |
Common Values
| Value | Count | Frequency (%) |
| 1.0 | 138790 | 61.2% |
| 2.0 | 49905 | 22.0% |
| 9.0 | 1222 | 0.5% |
| (Missing) | 36846 | 16.2% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1.0 | 138790 | 73.1% |
| 2.0 | 49905 | 26.3% |
| 9.0 | 1222 | 0.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| . | 189917 | |
| 0 | 189917 | |
| 1 | 138790 | |
| 2 | 49905 | 8.8% |
| 9 | 1222 | 0.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 379834 | |
| Other Punctuation | 189917 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 189917 | |
| 1 | 138790 | |
| 2 | 49905 | 13.1% |
| 9 | 1222 | 0.3% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 189917 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 569751 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| . | 189917 | |
| 0 | 189917 | |
| 1 | 138790 | |
| 2 | 49905 | 8.8% |
| 9 | 1222 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 569751 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| . | 189917 | |
| 0 | 189917 | |
| 1 | 138790 | |
| 2 | 49905 | 8.8% |
| 9 | 1222 | 0.2% |
SATURACAO
Categorical
HIGH CORRELATION  MISSING 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 41637 |
| Missing (%) | 18.4% |
| Memory size | 1.7 MiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 555378 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1.0 |
|---|---|
| 2nd row | 9.0 |
| 3rd row | 1.0 |
| 4th row | 1.0 |
| 5th row | 2.0 |
Common Values
| Value | Count | Frequency (%) |
| 1.0 | 119698 | 52.8% |
| 2.0 | 63757 | 28.1% |
| 9.0 | 1671 | 0.7% |
| (Missing) | 41637 | 18.4% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1.0 | 119698 | 64.7% |
| 2.0 | 63757 | 34.4% |
| 9.0 | 1671 | 0.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| . | 185126 | |
| 0 | 185126 | |
| 1 | 119698 | |
| 2 | 63757 | 11.5% |
| 9 | 1671 | 0.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 370252 | |
| Other Punctuation | 185126 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 185126 | |
| 1 | 119698 | |
| 2 | 63757 | 17.2% |
| 9 | 1671 | 0.5% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 185126 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 555378 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| . | 185126 | |
| 0 | 185126 | |
| 1 | 119698 | |
| 2 | 63757 | 11.5% |
| 9 | 1671 | 0.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 555378 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| . | 185126 | |
| 0 | 185126 | |
| 1 | 119698 | |
| 2 | 63757 | 11.5% |
| 9 | 1671 | 0.3% |
DIARREIA
Categorical
HIGH CORRELATION  IMBALANCE  MISSING 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 64895 |
| Missing (%) | 28.6% |
| Memory size | 1.7 MiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 485604 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2.0 |
|---|---|
| 2nd row | 9.0 |
| 3rd row | 2.0 |
| 4th row | 2.0 |
| 5th row | 2.0 |
Common Values
| Value | Count | Frequency (%) |
| 2.0 | 142670 | 62.9% |
| 1.0 | 17088 | 7.5% |
| 9.0 | 2110 | 0.9% |
| (Missing) | 64895 | 28.6% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 2.0 | 142670 | 88.1% |
| 1.0 | 17088 | 10.6% |
| 9.0 | 2110 | 1.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| . | 161868 | |
| 0 | 161868 | |
| 2 | 142670 | |
| 1 | 17088 | 3.5% |
| 9 | 2110 | 0.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 323736 | |
| Other Punctuation | 161868 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 161868 | |
| 2 | 142670 | |
| 1 | 17088 | 5.3% |
| 9 | 2110 | 0.7% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 161868 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 485604 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| . | 161868 | |
| 0 | 161868 | |
| 2 | 142670 | |
| 1 | 17088 | 3.5% |
| 9 | 2110 | 0.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 485604 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| . | 161868 | |
| 0 | 161868 | |
| 2 | 142670 | |
| 1 | 17088 | 3.5% |
| 9 | 2110 | 0.4% |
VOMITO
Categorical
HIGH CORRELATION  IMBALANCE  MISSING 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 63050 |
| Missing (%) | 27.8% |
| Memory size | 1.7 MiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 491139 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1.0 |
|---|---|
| 2nd row | 9.0 |
| 3rd row | 2.0 |
| 4th row | 2.0 |
| 5th row | 2.0 |
Common Values
| Value | Count | Frequency (%) |
| 2.0 | 135994 | 60.0% |
| 1.0 | 25602 | 11.3% |
| 9.0 | 2117 | 0.9% |
| (Missing) | 63050 | 27.8% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 2.0 | 135994 | 83.1% |
| 1.0 | 25602 | 15.6% |
| 9.0 | 2117 | 1.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| . | 163713 | |
| 0 | 163713 | |
| 2 | 135994 | |
| 1 | 25602 | 5.2% |
| 9 | 2117 | 0.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 327426 | |
| Other Punctuation | 163713 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 163713 | |
| 2 | 135994 | |
| 1 | 25602 | 7.8% |
| 9 | 2117 | 0.6% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 163713 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 491139 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| . | 163713 | |
| 0 | 163713 | |
| 2 | 135994 | |
| 1 | 25602 | 5.2% |
| 9 | 2117 | 0.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 491139 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| . | 163713 | |
| 0 | 163713 | |
| 2 | 135994 | |
| 1 | 25602 | 5.2% |
| 9 | 2117 | 0.4% |
DOR_ABD
Categorical
HIGH CORRELATION  IMBALANCE  MISSING 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 66861 |
| Missing (%) | 29.5% |
| Memory size | 1.7 MiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 479706 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2.0 |
|---|---|
| 2nd row | 9.0 |
| 3rd row | 2.0 |
| 4th row | 2.0 |
| 5th row | 2.0 |
Common Values
| Value | Count | Frequency (%) |
| 2.0 | 143003 | 63.1% |
| 1.0 | 12551 | 5.5% |
| 9.0 | 4348 | 1.9% |
| (Missing) | 66861 | 29.5% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 2.0 | 143003 | 89.4% |
| 1.0 | 12551 | 7.8% |
| 9.0 | 4348 | 2.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| . | 159902 | 33.3% |
| 0 | 159902 | 33.3% |
| 2 | 143003 | 29.8% |
| 1 | 12551 | 2.6% |
| 9 | 4348 | 0.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 319804 | 66.7% |
| Other Punctuation | 159902 | 33.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 159902 | 50.0% |
| 2 | 143003 | 44.7% |
| 1 | 12551 | 3.9% |
| 9 | 4348 | 1.4% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 159902 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 479706 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| . | 159902 | 33.3% |
| 0 | 159902 | 33.3% |
| 2 | 143003 | 29.8% |
| 1 | 12551 | 2.6% |
| 9 | 4348 | 0.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 479706 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| . | 159902 | 33.3% |
| 0 | 159902 | 33.3% |
| 2 | 143003 | 29.8% |
| 1 | 12551 | 2.6% |
| 9 | 4348 | 0.9% |
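The report's alert list flags DOR_ABD as highly imbalanced (63.8%). One definition that reproduces this figure from the value counts is one minus the Shannon entropy normalized by its maximum for the number of classes. The sketch below assumes that definition; the function name `imbalance_score` is illustrative, and the report itself does not state the formula.

```python
from math import log

# Non-missing DOR_ABD value counts for 2.0, 1.0 and 9.0.
counts = [143003, 12551, 4348]


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the counts.

    0.0 means perfectly balanced classes; values near 1.0 mean a single
    class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * log(c / total) for c in counts if c > 0)
    return 1 - entropy / log(k)


score = imbalance_score(counts)
print(f"{score:.1%}")
```

The score depends only on the proportions, so it is unaffected by how many rows are missing.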
FADIGA
Categorical
HIGH CORRELATION  MISSING 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 62595 |
| Missing (%) | 27.6% |
| Memory size | 1.7 MiB |
| 2.0 | 125302 |
|---|---|
| 1.0 | 35280 |
| 9.0 | 3586 |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 492504 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1.0 |
|---|---|
| 2nd row | 9.0 |
| 3rd row | 2.0 |
| 4th row | 2.0 |
| 5th row | 2.0 |
Common Values
| Value | Count | Frequency (%) |
| 2.0 | 125302 | 55.3% |
| 1.0 | 35280 | 15.6% |
| 9.0 | 3586 | 1.6% |
| (Missing) | 62595 | 27.6% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 2.0 | 125302 | 76.3% |
| 1.0 | 35280 | 21.5% |
| 9.0 | 3586 | 2.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| . | 164168 | 33.3% |
| 0 | 164168 | 33.3% |
| 2 | 125302 | 25.4% |
| 1 | 35280 | 7.2% |
| 9 | 3586 | 0.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 328336 | 66.7% |
| Other Punctuation | 164168 | 33.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 164168 | 50.0% |
| 2 | 125302 | 38.2% |
| 1 | 35280 | 10.7% |
| 9 | 3586 | 1.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 164168 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 492504 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| . | 164168 | 33.3% |
| 0 | 164168 | 33.3% |
| 2 | 125302 | 25.4% |
| 1 | 35280 | 7.2% |
| 9 | 3586 | 0.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 492504 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| . | 164168 | 33.3% |
| 0 | 164168 | 33.3% |
| 2 | 125302 | 25.4% |
| 1 | 35280 | 7.2% |
| 9 | 3586 | 0.7% |
PERD_OLFT
Categorical
HIGH CORRELATION  IMBALANCE  MISSING 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 68822 |
| Missing (%) | 30.3% |
| Memory size | 1.7 MiB |
| 2.0 | 148480 |
|---|---|
| 9.0 | 6906 |
| 1.0 | 2555 |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 473823 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2.0 |
|---|---|
| 2nd row | 9.0 |
| 3rd row | 2.0 |
| 4th row | 2.0 |
| 5th row | 2.0 |
Common Values
| Value | Count | Frequency (%) |
| 2.0 | 148480 | 65.5% |
| 9.0 | 6906 | 3.0% |
| 1.0 | 2555 | 1.1% |
| (Missing) | 68822 | 30.3% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 2.0 | 148480 | 94.0% |
| 9.0 | 6906 | 4.4% |
| 1.0 | 2555 | 1.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| . | 157941 | 33.3% |
| 0 | 157941 | 33.3% |
| 2 | 148480 | 31.3% |
| 9 | 6906 | 1.5% |
| 1 | 2555 | 0.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 315882 | 66.7% |
| Other Punctuation | 157941 | 33.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 157941 | 50.0% |
| 2 | 148480 | 47.0% |
| 9 | 6906 | 2.2% |
| 1 | 2555 | 0.8% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 157941 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 473823 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| . | 157941 | 33.3% |
| 0 | 157941 | 33.3% |
| 2 | 148480 | 31.3% |
| 9 | 6906 | 1.5% |
| 1 | 2555 | 0.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 473823 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| . | 157941 | 33.3% |
| 0 | 157941 | 33.3% |
| 2 | 148480 | 31.3% |
| 9 | 6906 | 1.5% |
| 1 | 2555 | 0.5% |
PERD_PALA
Categorical
HIGH CORRELATION  IMBALANCE  MISSING 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 69072 |
| Missing (%) | 30.5% |
| Memory size | 1.7 MiB |
| 2.0 | 148398 |
|---|---|
| 9.0 | 6894 |
| 1.0 | 2399 |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 473073 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2.0 |
|---|---|
| 2nd row | 9.0 |
| 3rd row | 2.0 |
| 4th row | 2.0 |
| 5th row | 2.0 |
Common Values
| Value | Count | Frequency (%) |
| 2.0 | 148398 | 65.4% |
| 9.0 | 6894 | 3.0% |
| 1.0 | 2399 | 1.1% |
| (Missing) | 69072 | 30.5% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 2.0 | 148398 | 94.1% |
| 9.0 | 6894 | 4.4% |
| 1.0 | 2399 | 1.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| . | 157691 | 33.3% |
| 0 | 157691 | 33.3% |
| 2 | 148398 | 31.4% |
| 9 | 6894 | 1.5% |
| 1 | 2399 | 0.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 315382 | 66.7% |
| Other Punctuation | 157691 | 33.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 157691 | 50.0% |
| 2 | 148398 | 47.1% |
| 9 | 6894 | 2.2% |
| 1 | 2399 | 0.8% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 157691 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 473073 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| . | 157691 | 33.3% |
| 0 | 157691 | 33.3% |
| 2 | 148398 | 31.4% |
| 9 | 6894 | 1.5% |
| 1 | 2399 | 0.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 473073 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| . | 157691 | 33.3% |
| 0 | 157691 | 33.3% |
| 2 | 148398 | 31.4% |
| 9 | 6894 | 1.5% |
| 1 | 2399 | 0.5% |
OUTRO_SIN
Categorical
MISSING 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 67384 |
| Missing (%) | 29.7% |
| Memory size | 1.7 MiB |
| 2.0 | 91667 |
|---|---|
| 1.0 | 64556 |
| 9.0 | 3156 |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 478137 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2.0 |
|---|---|
| 2nd row | 9.0 |
| 3rd row | 2.0 |
| 4th row | 1.0 |
| 5th row | 1.0 |
Common Values
| Value | Count | Frequency (%) |
| 2.0 | 91667 | 40.4% |
| 1.0 | 64556 | 28.5% |
| 9.0 | 3156 | 1.4% |
| (Missing) | 67384 | 29.7% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 2.0 | 91667 | 57.5% |
| 1.0 | 64556 | 40.5% |
| 9.0 | 3156 | 2.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| . | 159379 | 33.3% |
| 0 | 159379 | 33.3% |
| 2 | 91667 | 19.2% |
| 1 | 64556 | 13.5% |
| 9 | 3156 | 0.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 318758 | 66.7% |
| Other Punctuation | 159379 | 33.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 159379 | 50.0% |
| 2 | 91667 | 28.8% |
| 1 | 64556 | 20.3% |
| 9 | 3156 | 1.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 159379 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 478137 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| . | 159379 | 33.3% |
| 0 | 159379 | 33.3% |
| 2 | 91667 | 19.2% |
| 1 | 64556 | 13.5% |
| 9 | 3156 | 0.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 478137 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| . | 159379 | 33.3% |
| 0 | 159379 | 33.3% |
| 2 | 91667 | 19.2% |
| 1 | 64556 | 13.5% |
| 9 | 3156 | 0.7% |
OUTRO_DES
Text
MISSING 
| Distinct | 19018 |
|---|---|
| Distinct (%) | 29.9% |
| Missing | 163149 |
| Missing (%) | 71.9% |
| Memory size | 1.7 MiB |
Length
| Max length | 60 |
|---|---|
| Median length | 54 |
| Mean length | 14.280567 |
| Min length | 1 |
Characters and Unicode
| Total characters | 908444 |
|---|---|
| Distinct characters | 64 |
| Distinct categories | 12 |
| Distinct scripts | 2 |
| Distinct blocks | 2 |
Unique
| Unique | 16465 |
|---|---|
| Unique (%) | 25.9% |
Sample
| 1st row | FRAQUEZA,MAL ESTAR,MIALGIA |
|---|---|
| 2nd row | CORIZA |
| 3rd row | CORIZA |
| 4th row | CORIZA |
| 5th row | LESOES BOLHOSAS+HIPEREMIA |
| Value | Count | Frequency (%) |
| coriza | 20888 | 17.6% |
| nasal | 5951 | 5.0% |
| dor | 4379 | 3.7% |
| cefaleia | 4147 | 3.5% |
| e | 3298 | 2.8% |
| congestao | 3262 | 2.8% |
| mialgia | 2855 | 2.4% |
| toracica | 2001 | 1.7% |
| de | 1955 | 1.7% |
| inapetencia | 1849 | 1.6% |
| Other values (9571) | 67836 | 57.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 136700 | 15.0% |
| I | 91509 | 10.1% |
| O | 87376 | 9.6% |
| C | 70125 | 7.7% |
| R | 67558 | 7.4% |
| E | 67252 | 7.4% |
| (space) | 55059 | 6.1% |
| S | 53146 | 5.9% |
| N | 44331 | 4.9% |
| T | 36626 | 4.0% |
| Other values (54) | 198762 | 21.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 824957 | 90.8% |
| Space Separator | 55071 | 6.1% |
| Other Punctuation | 26093 | 2.9% |
| Decimal Number | 1199 | 0.1% |
| Math Symbol | 876 | 0.1% |
| Dash Punctuation | 154 | < 0.1% |
| Open Punctuation | 45 | < 0.1% |
| Close Punctuation | 42 | < 0.1% |
| Other Symbol | 3 | < 0.1% |
| Other Number | 2 | < 0.1% |
| Other values (2) | 2 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 136700 | 16.6% |
| I | 91509 | 11.1% |
| O | 87376 | 10.6% |
| C | 70125 | 8.5% |
| R | 67558 | 8.2% |
| E | 67252 | 8.2% |
| S | 53146 | 6.4% |
| N | 44331 | 5.4% |
| T | 36626 | 4.4% |
| L | 29780 | 3.6% |
| Other values (16) | 140554 | 17.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 20097 | 77.0% |
| . | 2354 | 9.0% |
| / | 1937 | 7.4% |
| " | 848 | 3.2% |
| ; | 669 | 2.6% |
| % | 68 | 0.3% |
| * | 47 | 0.2% |
| ? | 36 | 0.1% |
| : | 26 | 0.1% |
| ' | 6 | < 0.1% |
| Other values (2) | 5 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 189 | 15.8% |
| 0 | 158 | 13.2% |
| 4 | 156 | 13.0% |
| 3 | 148 | 12.3% |
| 5 | 127 | 10.6% |
| 8 | 111 | 9.3% |
| 6 | 111 | 9.3% |
| 9 | 76 | 6.3% |
| 1 | 67 | 5.6% |
| 7 | 56 | 4.7% |
Math Symbol
| Value | Count | Frequency (%) |
| + | 863 | 98.5% |
| \| | 4 | 0.5% |
| ~ | 3 | 0.3% |
| = | 3 | 0.3% |
| < | 2 | 0.2% |
| > | 1 | 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 55059 | > 99.9% |
| (no-break space, U+00A0) | 12 | < 0.1% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 41 | 97.6% |
| ] | 1 | 2.4% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 154 | 100.0% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 45 | 100.0% |
Other Symbol
| Value | Count | Frequency (%) |
| ° | 3 | 100.0% |
Other Number
| Value | Count | Frequency (%) |
| ² | 2 | 100.0% |
Modifier Symbol
| Value | Count | Frequency (%) |
| ^ | 1 | 100.0% |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 1 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 824957 | 90.8% |
| Common | 83487 | 9.2% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| (space) | 55059 | 65.9% |
| , | 20097 | 24.1% |
| . | 2354 | 2.8% |
| / | 1937 | 2.3% |
| + | 863 | 1.0% |
| " | 848 | 1.0% |
| ; | 669 | 0.8% |
| 2 | 189 | 0.2% |
| 0 | 158 | 0.2% |
| 4 | 156 | 0.2% |
| Other values (28) | 1157 | 1.4% |
Latin
| Value | Count | Frequency (%) |
| A | 136700 | 16.6% |
| I | 91509 | 11.1% |
| O | 87376 | 10.6% |
| C | 70125 | 8.5% |
| R | 67558 | 8.2% |
| E | 67252 | 8.2% |
| S | 53146 | 6.4% |
| N | 44331 | 5.4% |
| T | 36626 | 4.4% |
| L | 29780 | 3.6% |
| Other values (16) | 140554 | 17.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 908425 | > 99.9% |
| None | 19 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| A | 136700 | 15.0% |
| I | 91509 | 10.1% |
| O | 87376 | 9.6% |
| C | 70125 | 7.7% |
| R | 67558 | 7.4% |
| E | 67252 | 7.4% |
| (space) | 55059 | 6.1% |
| S | 53146 | 5.9% |
| N | 44331 | 4.9% |
| T | 36626 | 4.0% |
| Other values (50) | 198743 | 21.9% |
None
| Value | Count | Frequency (%) |
| (no-break space, U+00A0) | 12 | 63.2% |
| ° | 3 | 15.8% |
| ² | 2 | 10.5% |
| ¿ | 2 | 10.5% |
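The word table for OUTRO_DES lists lowercase tokens even though the raw values are uppercase and use commas, plus signs and spaces as separators. A comparable word count can be sketched as follows. The tokenization rule (lowercase, then keep runs of letters) is an assumption, not necessarily what the profiler does, and `samples` simply reuses the five sample rows shown for this field.

```python
import re
from collections import Counter

# The five sample rows shown for OUTRO_DES.
samples = [
    "FRAQUEZA,MAL ESTAR,MIALGIA",
    "CORIZA",
    "CORIZA",
    "CORIZA",
    "LESOES BOLHOSAS+HIPEREMIA",
]

# Assumed tokenization: lowercase, then keep maximal runs of letters
# (word characters excluding digits and the underscore).
TOKEN = re.compile(r"[^\W\d_]+")

word_counts = Counter(
    token for text in samples for token in TOKEN.findall(text.lower())
)

for word, count in word_counts.most_common(3):
    print(word, count)
```

Splitting on punctuation as well as whitespace matters for this field, since many entries pack several symptoms into one string without spaces.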
VACINA
Categorical
MISSING 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 106570 |
| Missing (%) | 47.0% |
| Memory size | 1.7 MiB |
| 9.0 | 59522 |
|---|---|
| 2.0 | 45950 |
| 1.0 | 14721 |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 360579 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1.0 |
|---|---|
| 2nd row | 9.0 |
| 3rd row | 2.0 |
| 4th row | 1.0 |
| 5th row | 2.0 |
Common Values
| Value | Count | Frequency (%) |
| 9.0 | 59522 | 26.2% |
| 2.0 | 45950 | 20.3% |
| 1.0 | 14721 | 6.5% |
| (Missing) | 106570 | 47.0% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 9.0 | 59522 | 49.5% |
| 2.0 | 45950 | 38.2% |
| 1.0 | 14721 | 12.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| . | 120193 | 33.3% |
| 0 | 120193 | 33.3% |
| 9 | 59522 | 16.5% |
| 2 | 45950 | 12.7% |
| 1 | 14721 | 4.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 240386 | 66.7% |
| Other Punctuation | 120193 | 33.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 120193 | 50.0% |
| 9 | 59522 | 24.8% |
| 2 | 45950 | 19.1% |
| 1 | 14721 | 6.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 120193 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 360579 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| . | 120193 | 33.3% |
| 0 | 120193 | 33.3% |
| 9 | 59522 | 16.5% |
| 2 | 45950 | 12.7% |
| 1 | 14721 | 4.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 360579 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| . | 120193 | 33.3% |
| 0 | 120193 | 33.3% |
| 9 | 59522 | 16.5% |
| 2 | 45950 | 12.7% |
| 1 | 14721 | 4.1% |
VACINA_COV
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 56 |
| Missing (%) | < 0.1% |
| Memory size | 1.7 MiB |
| 1.0 | 114017 |
|---|---|
| 2.0 | 108403 |
| 9.0 | 4287 |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 680121 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1.0 |
|---|---|
| 2nd row | 1.0 |
| 3rd row | 1.0 |
| 4th row | 1.0 |
| 5th row | 1.0 |
Common Values
| Value | Count | Frequency (%) |
| 1.0 | 114017 | 50.3% |
| 2.0 | 108403 | 47.8% |
| 9.0 | 4287 | 1.9% |
| (Missing) | 56 | < 0.1% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1.0 | 114017 | 50.3% |
| 2.0 | 108403 | 47.8% |
| 9.0 | 4287 | 1.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| . | 226707 | 33.3% |
| 0 | 226707 | 33.3% |
| 1 | 114017 | 16.8% |
| 2 | 108403 | 15.9% |
| 9 | 4287 | 0.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 453414 | 66.7% |
| Other Punctuation | 226707 | 33.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 226707 | 50.0% |
| 1 | 114017 | 25.1% |
| 2 | 108403 | 23.9% |
| 9 | 4287 | 0.9% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 226707 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 680121 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| . | 226707 | 33.3% |
| 0 | 226707 | 33.3% |
| 1 | 114017 | 16.8% |
| 2 | 108403 | 15.9% |
| 9 | 4287 | 0.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 680121 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| . | 226707 | 33.3% |
| 0 | 226707 | 33.3% |
| 1 | 114017 | 16.8% |
| 2 | 108403 | 15.9% |
| 9 | 4287 | 0.6% |
DOSE_2REF
Date
MISSING 
| Distinct | 462 |
|---|---|
| Distinct (%) | 1.2% |
| Missing | 189217 |
| Missing (%) | 83.4% |
| Memory size | 1.7 MiB |
| Minimum | 2021-01-03 00:00:00 |
|---|---|
| Maximum | 2023-12-04 00:00:00 |
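In the sample rows, DOSE_2REF appears as strings such as `11/04/2022` and `25/05/2022`. The second value cannot be month-first, so the sketch below assumes a day/month/year layout and parses it explicitly instead of relying on format inference. The list `raw` and the helper `parse_day_first` are illustrative names.

```python
from datetime import datetime

# Values as they appear in the sample rows; None stands in for a missing cell.
raw = ["11/04/2022", "10/05/2022", None, "19/04/2022", "25/05/2022"]


def parse_day_first(text):
    """Parse a dd/mm/yyyy string into a date; pass missing values through."""
    if not text:
        return None
    return datetime.strptime(text, "%d/%m/%Y").date()


parsed = [d for d in map(parse_day_first, raw) if d is not None]
earliest, latest = min(parsed), max(parsed)
print(earliest, latest)
```

An explicit format avoids silently swapping day and month for ambiguous values such as `10/05/2022`.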
CLASSI_FIN
Categorical
MISSING 
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 17353 |
| Missing (%) | 7.7% |
| Memory size | 1.7 MiB |
| 4.0 | 118732 |
|---|---|
| 2.0 | 40115 |
| 5.0 | 34688 |
| 1.0 | 13128 |
| 3.0 | 2747 |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 628230 |
|---|---|
| Distinct characters | 7 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 4.0 |
|---|---|
| 2nd row | 4.0 |
| 3rd row | 5.0 |
| 4th row | 4.0 |
| 5th row | 4.0 |
Common Values
| Value | Count | Frequency (%) |
| 4.0 | 118732 | 52.4% |
| 2.0 | 40115 | 17.7% |
| 5.0 | 34688 | 15.3% |
| 1.0 | 13128 | 5.8% |
| 3.0 | 2747 | 1.2% |
| (Missing) | 17353 | 7.7% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 4.0 | 118732 | 56.7% |
| 2.0 | 40115 | 19.2% |
| 5.0 | 34688 | 16.6% |
| 1.0 | 13128 | 6.3% |
| 3.0 | 2747 | 1.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| . | 209410 | 33.3% |
| 0 | 209410 | 33.3% |
| 4 | 118732 | 18.9% |
| 2 | 40115 | 6.4% |
| 5 | 34688 | 5.5% |
| 1 | 13128 | 2.1% |
| 3 | 2747 | 0.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 418820 | 66.7% |
| Other Punctuation | 209410 | 33.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 209410 | 50.0% |
| 4 | 118732 | 28.3% |
| 2 | 40115 | 9.6% |
| 5 | 34688 | 8.3% |
| 1 | 13128 | 3.1% |
| 3 | 2747 | 0.7% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 209410 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 628230 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| . | 209410 | 33.3% |
| 0 | 209410 | 33.3% |
| 4 | 118732 | 18.9% |
| 2 | 40115 | 6.4% |
| 5 | 34688 | 5.5% |
| 1 | 13128 | 2.1% |
| 3 | 2747 | 0.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 628230 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| . | 209410 | 33.3% |
| 0 | 209410 | 33.3% |
| 4 | 118732 | 18.9% |
| 2 | 40115 | 6.4% |
| 5 | 34688 | 5.5% |
| 1 | 13128 | 2.1% |
| 3 | 2747 | 0.4% |
CLASSI_OUT
Text
MISSING 
| Distinct | 485 |
|---|---|
| Distinct (%) | 24.4% |
| Missing | 224773 |
| Missing (%) | 99.1% |
| Memory size | 1.7 MiB |
Length
| Max length | 30 |
|---|---|
| Median length | 27 |
| Mean length | 14.062814 |
| Min length | 2 |
Characters and Unicode
| Total characters | 27985 |
|---|---|
| Distinct characters | 46 |
| Distinct categories | 8 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 370 |
|---|---|
| Unique (%) | 18.6% |
Sample
| 1st row | MICOBACTERIA TUBERCULOSE |
|---|---|
| 2nd row | TUBERCULOSE |
| 3rd row | CRISE ASMATICA |
| 4th row | BRONQUITE |
| 5th row | DISPNEIA |
| Value | Count | Frequency (%) |
| pneumonia | 759 | 23.5% |
| bacteriana | 144 | 4.5% |
| bronquiolite | 101 | 3.1% |
| tuberculose | 101 | 3.1% |
| virus | 96 | 3.0% |
| sincicial | 92 | 2.9% |
| asma | 89 | 2.8% |
| pnm | 83 | 2.6% |
| respiratorio | 75 | 2.3% |
|  | 58 | 1.8% |
| Other values (472) | 1626 | 50.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 2867 | 10.2% |
| N | 2827 | 10.1% |
| I | 2800 | 10.0% |
| E | 2555 | 9.1% |
| O | 2437 | 8.7% |
| U | 1969 | 7.0% |
| R | 1645 | 5.9% |
| P | 1512 | 5.4% |
| M | 1441 | 5.1% |
| C | 1353 | 4.8% |
| Other values (36) | 6579 | 23.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 26504 | 94.7% |
| Space Separator | 1240 | 4.4% |
| Other Punctuation | 147 | 0.5% |
| Decimal Number | 46 | 0.2% |
| Dash Punctuation | 26 | 0.1% |
| Math Symbol | 19 | 0.1% |
| Close Punctuation | 2 | < 0.1% |
| Open Punctuation | 1 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 2867 | 10.8% |
| N | 2827 | 10.7% |
| I | 2800 | 10.6% |
| E | 2555 | 9.6% |
| O | 2437 | 9.2% |
| U | 1969 | 7.4% |
| R | 1645 | 6.2% |
| P | 1512 | 5.7% |
| M | 1441 | 5.4% |
| C | 1353 | 5.1% |
| Other values (16) | 5098 | 19.2% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 13 | 28.3% |
| 3 | 8 | 17.4% |
| 8 | 6 | 13.0% |
| 0 | 5 | 10.9% |
| 6 | 4 | 8.7% |
| 4 | 4 | 8.7% |
| 9 | 3 | 6.5% |
| 5 | 3 | 6.5% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 59 | 40.1% |
| , | 48 | 32.7% |
| . | 35 | 23.8% |
| ? | 3 | 2.0% |
| ; | 2 | 1.4% |
Math Symbol
| Value | Count | Frequency (%) |
| + | 18 | 94.7% |
| = | 1 | 5.3% |
Close Punctuation
| Value | Count | Frequency (%) |
| ] | 1 | 50.0% |
| ) | 1 | 50.0% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 1240 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 26 | 100.0% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 1 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 26504 | 94.7% |
| Common | 1481 | 5.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| A | 2867 | 10.8% |
| N | 2827 | 10.7% |
| I | 2800 | 10.6% |
| E | 2555 | 9.6% |
| O | 2437 | 9.2% |
| U | 1969 | 7.4% |
| R | 1645 | 6.2% |
| P | 1512 | 5.7% |
| M | 1441 | 5.4% |
| C | 1353 | 5.1% |
| Other values (16) | 5098 | 19.2% |
Common
| Value | Count | Frequency (%) |
| (space) | 1240 | 83.7% |
| / | 59 | 4.0% |
| , | 48 | 3.2% |
| . | 35 | 2.4% |
| - | 26 | 1.8% |
| + | 18 | 1.2% |
| 1 | 13 | 0.9% |
| 3 | 8 | 0.5% |
| 8 | 6 | 0.4% |
| 0 | 5 | 0.3% |
| Other values (10) | 23 | 1.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 27985 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| A | 2867 | 10.2% |
| N | 2827 | 10.1% |
| I | 2800 | 10.0% |
| E | 2555 | 9.1% |
| O | 2437 | 8.7% |
| U | 1969 | 7.0% |
| R | 1645 | 5.9% |
| P | 1512 | 5.4% |
| M | 1441 | 5.1% |
| C | 1353 | 4.8% |
| Other values (36) | 6579 | 23.5% |
| | SEM_PRI | NU_IDADE_N | ESTRANG | SG_UF | CS_ZONA | FEBRE | TOSSE | GARGANTA | DISPNEIA | DESC_RESP | SATURACAO | DIARREIA | VOMITO | DOR_ABD | FADIGA | PERD_OLFT | PERD_PALA | OUTRO_SIN | VACINA | VACINA_COV | CLASSI_FIN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SEM_PRI | 1.000 | -0.073 | 0.009 | 0.077 | 0.041 | 0.048 | 0.064 | 0.023 | 0.026 | 0.045 | 0.022 | 0.018 | 0.014 | 0.019 | 0.026 | 0.028 | 0.028 | 0.023 | 0.067 | 0.070 | 0.137 |
| NU_IDADE_N | -0.073 | 1.000 | 0.050 | 0.075 | 0.042 | 0.179 | 0.163 | 0.091 | 0.056 | 0.100 | 0.087 | 0.028 | 0.085 | 0.040 | 0.115 | 0.065 | 0.060 | 0.039 | 0.093 | 0.487 | 0.236 |
| ESTRANG | 0.009 | 0.050 | 1.000 | 0.113 | 0.012 | 0.000 | 0.008 | 0.004 | 0.001 | 0.007 | 0.000 | 0.007 | 0.003 | 0.000 | 0.006 | 0.000 | 0.000 | 0.000 | 0.013 | 0.021 | 0.019 |
| SG_UF | 0.077 | 0.075 | 0.113 | 1.000 | 0.253 | 0.135 | 0.094 | 0.143 | 0.116 | 0.087 | 0.149 | 0.067 | 0.081 | 0.081 | 0.128 | 0.086 | 0.090 | 0.164 | 0.187 | 0.135 | 0.157 |
| CS_ZONA | 0.041 | 0.042 | 0.012 | 0.253 | 1.000 | 0.034 | 0.028 | 0.022 | 0.035 | 0.035 | 0.024 | 0.008 | 0.011 | 0.016 | 0.019 | 0.009 | 0.011 | 0.039 | 0.073 | 0.051 | 0.061 |
| FEBRE | 0.048 | 0.179 | 0.000 | 0.135 | 0.034 | 1.000 | 0.435 | 0.281 | 0.287 | 0.329 | 0.284 | 0.389 | 0.386 | 0.275 | 0.287 | 0.229 | 0.230 | 0.204 | 0.043 | 0.120 | 0.087 |
| TOSSE | 0.064 | 0.163 | 0.008 | 0.094 | 0.028 | 0.435 | 1.000 | 0.296 | 0.334 | 0.352 | 0.280 | 0.383 | 0.376 | 0.274 | 0.291 | 0.231 | 0.231 | 0.213 | 0.048 | 0.114 | 0.104 |
| GARGANTA | 0.023 | 0.091 | 0.004 | 0.143 | 0.022 | 0.281 | 0.296 | 1.000 | 0.311 | 0.272 | 0.255 | 0.356 | 0.353 | 0.518 | 0.423 | 0.506 | 0.501 | 0.232 | 0.077 | 0.077 | 0.083 |
| DISPNEIA | 0.026 | 0.056 | 0.001 | 0.116 | 0.035 | 0.287 | 0.334 | 0.311 | 1.000 | 0.510 | 0.429 | 0.357 | 0.343 | 0.339 | 0.310 | 0.280 | 0.278 | 0.176 | 0.039 | 0.010 | 0.065 |
| DESC_RESP | 0.045 | 0.100 | 0.007 | 0.087 | 0.035 | 0.329 | 0.352 | 0.272 | 0.510 | 1.000 | 0.505 | 0.408 | 0.402 | 0.302 | 0.350 | 0.256 | 0.257 | 0.206 | 0.040 | 0.081 | 0.094 |
| SATURACAO | 0.022 | 0.087 | 0.000 | 0.149 | 0.024 | 0.284 | 0.280 | 0.255 | 0.429 | 0.505 | 1.000 | 0.370 | 0.365 | 0.277 | 0.316 | 0.236 | 0.240 | 0.208 | 0.070 | 0.035 | 0.053 |
| DIARREIA | 0.018 | 0.028 | 0.007 | 0.067 | 0.008 | 0.389 | 0.383 | 0.356 | 0.357 | 0.408 | 0.370 | 1.000 | 0.639 | 0.453 | 0.438 | 0.351 | 0.350 | 0.300 | 0.062 | 0.028 | 0.028 |
| VOMITO | 0.014 | 0.085 | 0.003 | 0.081 | 0.011 | 0.386 | 0.376 | 0.353 | 0.343 | 0.402 | 0.365 | 0.639 | 1.000 | 0.472 | 0.441 | 0.350 | 0.350 | 0.309 | 0.058 | 0.051 | 0.031 |
| DOR_ABD | 0.019 | 0.040 | 0.000 | 0.081 | 0.016 | 0.275 | 0.274 | 0.518 | 0.339 | 0.302 | 0.277 | 0.453 | 0.472 | 1.000 | 0.504 | 0.524 | 0.519 | 0.256 | 0.066 | 0.049 | 0.057 |
| FADIGA | 0.026 | 0.115 | 0.006 | 0.128 | 0.019 | 0.287 | 0.291 | 0.423 | 0.310 | 0.350 | 0.316 | 0.438 | 0.441 | 0.504 | 1.000 | 0.479 | 0.475 | 0.277 | 0.060 | 0.085 | 0.067 |
| PERD_OLFT | 0.028 | 0.065 | 0.000 | 0.086 | 0.009 | 0.229 | 0.231 | 0.506 | 0.280 | 0.256 | 0.236 | 0.351 | 0.350 | 0.524 | 0.479 | 1.000 | 0.731 | 0.257 | 0.059 | 0.061 | 0.063 |
| PERD_PALA | 0.028 | 0.060 | 0.000 | 0.090 | 0.011 | 0.230 | 0.231 | 0.501 | 0.278 | 0.257 | 0.240 | 0.350 | 0.350 | 0.519 | 0.475 | 0.731 | 1.000 | 0.261 | 0.063 | 0.056 | 0.066 |
| OUTRO_SIN | 0.023 | 0.039 | 0.000 | 0.164 | 0.039 | 0.204 | 0.213 | 0.232 | 0.176 | 0.206 | 0.208 | 0.300 | 0.309 | 0.256 | 0.277 | 0.257 | 0.261 | 1.000 | 0.112 | 0.044 | 0.053 |
| VACINA | 0.067 | 0.093 | 0.013 | 0.187 | 0.073 | 0.043 | 0.048 | 0.077 | 0.039 | 0.040 | 0.070 | 0.062 | 0.058 | 0.066 | 0.060 | 0.059 | 0.063 | 0.112 | 1.000 | 0.138 | 0.061 |
| VACINA_COV | 0.070 | 0.487 | 0.021 | 0.135 | 0.051 | 0.120 | 0.114 | 0.077 | 0.010 | 0.081 | 0.035 | 0.028 | 0.051 | 0.049 | 0.085 | 0.061 | 0.056 | 0.044 | 0.138 | 1.000 | 0.261 |
| CLASSI_FIN | 0.137 | 0.236 | 0.019 | 0.157 | 0.061 | 0.087 | 0.104 | 0.083 | 0.065 | 0.094 | 0.053 | 0.028 | 0.031 | 0.057 | 0.067 | 0.063 | 0.066 | 0.053 | 0.061 | 0.261 | 1.000 |
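The matrix above does not say which coefficient it uses. For pairs of categorical columns, one common association measure bounded between 0 and 1 is Cramér's V, computed from the chi-squared statistic of the contingency table. The sketch below implements the plain, uncorrected version in pure Python. Both the choice of measure and the function name `cramers_v` are assumptions, so it should not be expected to reproduce the exact values shown.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equally long sequences of labels.

    Pairs where either value is None are dropped before counting.
    """
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    rows = Counter(x for x, _ in pairs)
    cols = Counter(y for _, y in pairs)
    k = min(len(rows), len(cols)) - 1
    if n == 0 or k <= 0:
        return 0.0  # undefined when a variable has fewer than two levels

    joint = Counter(pairs)
    chi2 = 0.0
    for r, n_r in rows.items():
        for c, n_c in cols.items():
            expected = n_r * n_c / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return sqrt(chi2 / (n * k))


# Perfectly associated and perfectly independent toy examples.
print(cramers_v([1, 1, 2, 2], [1, 1, 2, 2]))
print(cramers_v([1, 1, 2, 2], [1, 2, 1, 2]))
```

Note that in these symptom fields the code 9.0 and missing values tend to co-occur across columns, which can inflate any such association measure regardless of the clinical relationship between symptoms.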
| | DT_SIN_PRI | SEM_PRI | ESTRANG | NU_IDADE_N | SG_UF | CS_ZONA | OUT_ANIM | FEBRE | TOSSE | GARGANTA | DISPNEIA | DESC_RESP | SATURACAO | DIARREIA | VOMITO | DOR_ABD | FADIGA | PERD_OLFT | PERD_PALA | OUTRO_SIN | OUTRO_DES | VACINA | VACINA_COV | DOSE_2REF | CLASSI_FIN | CLASSI_OUT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 17/01/2023 | 3 | 2.0 | 75 | MG | 1.0 | NaN | 1.0 | 1.0 | 2.0 | 1.0 | 1.0 | 1.0 | 2.0 | 1.0 | 2.0 | 1.0 | 2.0 | 2.0 | 2.0 | NaN | 1.0 | 1.0 | 11/04/2022 | 4.0 | NaN |
| 1 | 01/01/2023 | 1 | 2.0 | 67 | RJ | 1.0 | NaN | 1.0 | 9.0 | 9.0 | 9.0 | 9.0 | 9.0 | 9.0 | 9.0 | 9.0 | 9.0 | 9.0 | 9.0 | 9.0 | NaN | 9.0 | 1.0 | 10/05/2022 | 4.0 | NaN |
| 2 | 05/01/2023 | 1 | NaN | 72 | SP | 1.0 | NaN | 2.0 | 1.0 | 2.0 | 1.0 | 2.0 | 1.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | NaN | NaN | 1.0 | 19/04/2022 | 5.0 | NaN |
| 3 | 18/01/2023 | 3 | 2.0 | 46 | SP | 1.0 | NaN | 1.0 | 2.0 | NaN | 1.0 | 1.0 | 1.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 1.0 | FRAQUEZA,MAL ESTAR,MIALGIA | 2.0 | 1.0 | NaN | 4.0 | NaN |
| 4 | 03/02/2023 | 5 | 2.0 | 71 | SP | NaN | NaN | NaN | 1.0 | 2.0 | 1.0 | 1.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | NaN | NaN | 1.0 | 1.0 | 25/05/2022 | 4.0 | NaN |
| 5 | 02/02/2023 | 5 | 2.0 | 7 | RJ | 1.0 | NaN | 2.0 | 1.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 1.0 | CORIZA | 2.0 | 1.0 | NaN | 4.0 | NaN |
| 6 | 25/01/2023 | 4 | 2.0 | 1 | BA | 1.0 | NaN | 1.0 | 2.0 | 2.0 | 1.0 | 1.0 | 1.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | NaN | 9.0 | 2.0 | NaN | 4.0 | NaN |
| 7 | 22/02/2023 | 8 | 2.0 | 4 | PR | 9.0 | NaN | NaN | 1.0 | NaN | 1.0 | NaN | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | CORIZA | NaN | 2.0 | NaN | 2.0 | NaN |
| 8 | 09/02/2023 | 6 | 2.0 | 8 | AL | NaN | NaN | 1.0 | 1.0 | 2.0 | 1.0 | 1.0 | 1.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 1.0 | CORIZA | NaN | 1.0 | NaN | 4.0 | NaN |
| 9 | 03/01/2023 | 1 | 2.0 | 86 | PR | 2.0 | NaN | 1.0 | 2.0 | NaN | 1.0 | 1.0 | 1.0 | 2.0 | 2.0 | 2.0 | 1.0 | 2.0 | 2.0 | NaN | NaN | NaN | 1.0 | 18/04/2022 | 4.0 | NaN |
| | DT_SIN_PRI | SEM_PRI | ESTRANG | NU_IDADE_N | SG_UF | CS_ZONA | OUT_ANIM | FEBRE | TOSSE | GARGANTA | DISPNEIA | DESC_RESP | SATURACAO | DIARREIA | VOMITO | DOR_ABD | FADIGA | PERD_OLFT | PERD_PALA | OUTRO_SIN | OUTRO_DES | VACINA | VACINA_COV | DOSE_2REF | CLASSI_FIN | CLASSI_OUT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 226753 | 11/08/2023 | 32 | 2.0 | 79 | CE | 1.0 | NaN | 2.0 | 1.0 | 2.0 | 2.0 | 1.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | NaN | NaN | 2.0 | NaN | 4.0 | NaN |
| 226754 | 28/09/2023 | 39 | 2.0 | 40 | SP | 1.0 | NaN | 2.0 | 1.0 | 2.0 | 1.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | NaN | NaN | 2.0 | NaN | 5.0 | NaN |
| 226755 | 10/09/2023 | 37 | 2.0 | 7 | DF | 3.0 | NaN | 1.0 | 1.0 | NaN | NaN | NaN | 1.0 | NaN | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2.0 | NaN | 5.0 | NaN |
| 226756 | 16/09/2023 | 37 | 2.0 | 2 | CE | 1.0 | NaN | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | NaN | NaN | NaN | 1.0 | NaN | NaN | NaN |
| 226757 | 15/09/2023 | 37 | 2.0 | 3 | MS | 1.0 | NaN | 1.0 | 1.0 | 2.0 | 1.0 | 1.0 | 1.0 | 2.0 | 2.0 | 2.0 | 1.0 | 2.0 | 2.0 | 1.0 | CONG. NASAL.RINORREIA.PROSTACA | 9.0 | 2.0 | NaN | 4.0 | NaN |
| 226758 | 17/09/2023 | 38 | 2.0 | 3 | SP | 1.0 | NaN | 1.0 | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2.0 | NaN | 5.0 | NaN |
| 226759 | 10/07/2023 | 28 | 2.0 | 5 | PR | NaN | NaN | 2.0 | 1.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 1.0 | ESFORCO RESP, ROUQUIDAO | 9.0 | 2.0 | NaN | 4.0 | NaN |
| 226760 | 30/07/2023 | 31 | 2.0 | 2 | RS | 1.0 | NaN | 2.0 | 1.0 | 2.0 | 1.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 1.0 | DIFICULDADE PARA MAMAR | 9.0 | 2.0 | NaN | 2.0 | NaN |
| 226761 | 11/09/2023 | 37 | NaN | 11 | DF | 1.0 | NaN | 1.0 | 1.0 | NaN | 1.0 | NaN | 1.0 | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2.0 | NaN | 4.0 | NaN |
| 226762 | 04/10/2023 | 40 | 2.0 | 20 | SP | 1.0 | NaN | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 1.0 | PARTO CESAREA ASSINTOMATICA | 2.0 | 1.0 | NaN | 5.0 | NaN |
Most frequently occurring
| | DT_SIN_PRI | SEM_PRI | ESTRANG | NU_IDADE_N | SG_UF | CS_ZONA | OUT_ANIM | FEBRE | TOSSE | GARGANTA | DISPNEIA | DESC_RESP | SATURACAO | DIARREIA | VOMITO | DOR_ABD | FADIGA | PERD_OLFT | PERD_PALA | OUTRO_SIN | OUTRO_DES | VACINA | VACINA_COV | DOSE_2REF | CLASSI_FIN | CLASSI_OUT | # duplicates |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 21 | 01/05/2023 | 18 | 2.0 | 26 | BA | 1.0 | NaN | 2.0 | 2.0 | 2.0 | 1.0 | 1.0 | 1.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | NaN | NaN | 2.0 | NaN | 2.0 | NaN | 4 |
| 26 | 01/06/2023 | 22 | 2.0 | 2 | SP | 3.0 | NaN | 2.0 | 1.0 | NaN | 1.0 | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2.0 | NaN | 4.0 | NaN | 4 |
| 37 | 01/09/2023 | 35 | NaN | 57 | DF | 1.0 | NaN | 1.0 | 1.0 | 2.0 | 1.0 | 1.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | NaN | 9.0 | 1.0 | 03/06/2022 | 1.0 | NaN | 4 |
| 40 | 01/10/2023 | 40 | 2.0 | 64 | SP | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 9.0 | 1.0 | 07/06/2022 | 5.0 | NaN | 4 |
| 41 | 02/01/2023 | 1 | 2.0 | 34 | PR | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | NaN | 4.0 | NaN | 4 |
| 44 | 02/02/2023 | 5 | NaN | 85 | PE | 1.0 | NaN | NaN | 1.0 | NaN | 1.0 | 1.0 | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | NaN | 5.0 | NaN | 4 |
| 60 | 02/07/2023 | 27 | 2.0 | 5 | PA | 2.0 | NaN | 1.0 | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2.0 | 2.0 | NaN | 4.0 | NaN | 4 |
| 61 | 02/07/2023 | 27 | 2.0 | 43 | TO | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | NaN | 5.0 | NaN | 4 |
| 62 | 02/07/2023 | 27 | 2.0 | 90 | TO | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2.0 | NaN | 5.0 | NaN | 4 |
| 76 | 03/05/2023 | 18 | 2.0 | 51 | SC | 1.0 | NaN | 1.0 | 1.0 | 1.0 | 1.0 | 2.0 | NaN | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | NaN | NaN | NaN | 1.0 | NaN | 5.0 | NaN | 4 |
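A table of repeated rows like the one above can be built by counting identical records. The sketch below uses a few shortened, made-up tuples in place of the full 26-column rows, and represents missing cells as `None` rather than NaN, because NaN values do not compare equal to each other and would prevent otherwise identical rows from matching.

```python
from collections import Counter

# Shortened, illustrative rows: (DT_SIN_PRI, SEM_PRI, SG_UF, OUT_ANIM).
rows = [
    ("01/05/2023", 18, "BA", None),
    ("01/05/2023", 18, "BA", None),
    ("02/01/2023", 1, "PR", None),
    ("01/05/2023", 18, "BA", None),
]

# Count how often each distinct row occurs.
row_counts = Counter(rows)

# Keep only rows that occur more than once, most frequent first.
duplicates = {row: n for row, n in row_counts.most_common() if n > 1}

# Number of rows that would be removed by keeping one copy of each.
extra_rows = sum(n - 1 for n in duplicates.values())

print(duplicates, extra_rows)
```

In a surveillance dataset without a patient identifier, identical rows are not necessarily data-entry duplicates: two different patients of the same age, state and onset date with sparse symptom data will produce the same record.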